Abstract

Maximum distance separable (MDS) codes are widely used in storage systems to protect against disk (node) failures. A node
is said to have capacity $l$ over some field $\mathbb{F}$ if it can store that many symbols of the field. An $(n,k,l)$ MDS code uses $n$ nodes
of capacity $l$ to store $k$ information nodes. The MDS property guarantees resiliency to any $n-k$ node failures. An optimal
bandwidth (resp. optimal access) MDS code communicates (resp. accesses) the minimum amount of data during the repair process
of a single failed node. It was shown that this amount equals a fraction $1/(n-k)$ of the data stored in each node. In previous
optimal bandwidth constructions, $l$ scaled polynomially with $k$ in codes with asymptotic rate $< 1$. Moreover, in constructions
with a constant number of parities, i.e., rate approaching 1, $l$ scaled exponentially with $k$. In this paper, we focus on the latter
case of a constant number of parities $n-k=r$, and ask the following question: given the capacity of a node $l$, what is the largest
number of information disks $k$ in an optimal bandwidth (resp. access) $(k+r,k,l)$ MDS code? We give an upper bound for the
general case, and two tight bounds in the special cases of two important families of codes. Moreover, the bounds show that in
some cases an optimal-bandwidth code has larger $k$ than an optimal-access code, and therefore these two measures are not equivalent.

I. Introduction

Erasure-correcting codes are the basis for widely used storage systems, where disks (nodes) correspond to symbols in the
code. An important family of codes is the maximum distance separable (MDS) codes, which provide optimal resiliency to
erasures for a given amount of redundancy. Namely, an MDS code with $r$ redundancy (parity) symbols can recover the information
from any $r$ symbol erasures. Because of this storage efficiency, MDS codes are highly favorable, and a lot of research has
been done to construct them. Examples of MDS codes are the well-known Reed-Solomon codes, EVENODD [1], [2], B-code
[24], X-code [25], RDP [7], and STAR-code [9]. It is evident that in the case of $r$ erasures, one needs to communicate all the
surviving information during the repair process. However, although the MDS codes used in practice are resilient to more than
a single erasure, i.e., the number of parity nodes is $r > 1$, the practical and more interesting question is: what is the minimum repair
bandwidth in a single node erasure? The repair bandwidth is defined as the amount of information communicated during the
repair process. This question has received much interest recently due to both its practical and theoretical importance. From a
practical viewpoint, decreasing the repair bandwidth shortens both the repair process and the inaccessibility time of the erased
information. Moreover, from a theoretical perspective, this question has deep connections to the widely used interference
alignment technique and network coding.

A. The Problem

The problem of efficient repair was defined by Dimakis et al. in [8]. It considers a file of size $\mathcal{M}$ symbols, divided into
$k$ equally sized chunks stored using an $(n,k,l)$ MDS code over the finite field $\mathbb{F}$, where $n$ is the number of nodes, each of
capacity $l = \mathcal{M}/k$. Namely, each node can store up to $l$ symbols and each symbol corresponds to $\log |\mathbb{F}|$ bits. The first $k$
nodes, which are referred to as the systematic nodes, store the raw information. The latter $r = n-k$ nodes are the parity nodes,
which store a function of the raw information. Since the code is MDS, it can tolerate any loss of up to $r$ nodes. However, the
more common scenario is the failure (erasure) of only one node. [8] proved that

\[
\frac{\mathcal{M}}{k} \cdot \frac{n-1}{n-k} \tag{1}
\]

is a lower bound on the repair bandwidth for an $(n,k,l)$ MDS code. For example, in a code with $r = 2$ parities, each of the
$n-1$ surviving nodes needs to communicate during the repair process, on average, at least $l/2$ symbols, which is equal
to one half of the node's capacity. Note that repair is possible since the code is resilient to more than one erasure, and a
repair strategy of communicating the entire remaining information suffices. An MDS code is termed optimal bandwidth if it
achieves the lower bound in (1) during the repair process of any of its systematic nodes.$^1$ Figure 1 shows an optimal bandwidth
(6,4,2) MDS code. For repairing an erased node, one symbol of information is transmitted to the repair center from each
surviving node. In some applications such as data centers, reading (accessing) the information is more costly than transmitting
it. Therefore, during a repair process, the need to transmit data that is a function of a large portion of the information stored
within a node can cause a bottleneck. For example, node N1 needs to access its entire stored information in order to calculate
$a + w$ during the repair process of node N3. Therefore, in large scale storage systems, one might need to minimize not
only the amount of information transmitted but also the number of accessed information elements. An optimal access MDS
code is an optimal bandwidth code that transmits only the elements it accesses. By definition, any optimal access code is also
an optimal bandwidth code. The shortened code restricted to nodes {N1, N2, Parity 1, Parity 2} in Figure 1 is an example of
an optimal access (4,2,2) MDS code. In [13] a similar scheme termed repair by transfer was considered. In this scheme an
exact repair of a lost node is performed by mere transmission of information, without any calculation in any of the surviving
nodes or at the repair center.

The material in this paper was presented in part at the IEEE International Symposium on Information Theory (ISIT 2012), Cambridge, MA, USA, July
2012.

1 The relaxed requirement of optimal repair only for the systematic nodes is reasonable, because the number of parity nodes in most storage systems is
negligible compared to systematic nodes. Moreover, in an erasure of a systematic node, the raw information is not accessible as opposed to a parity node

Figure 1. A (6,4,2) MDS code with optimal bandwidth over the field $\mathbb{F}_7$. Nodes N1, N2, N3, N4 are systematic and the last 2 nodes are parity nodes. For
repairing node N1 (resp. N2) transmit the first (resp. second) row from each surviving node. For repairing node N3 transmit from each surviving node the sum
of its two elements. For repairing node N4 transmit the sum of the first row and twice the second row from Parity 2, and the sum of the first row and four
times the second row from the rest. Notice that this code can be converted to be over the field of size 4, i.e., a (6,4,2) MDS code with optimal bandwidth
over the field $\mathbb{F}_{2^2}$.

Figure 2. Summary of known results on the maximum number of information nodes $k$ in a $(k+r,k,l)$ MDS code. The derived upper bounds apply for
codes with constant repairing subspaces. The upper bounds in the general case (not necessarily constant repairing subspaces) are at most greater by one than
the bounds presented in the table. ✓ indicates a tight bound, * indicates a new upper bound. The references refer to previously known lower bounds.

When the value of a stored element is updated, each parity node needs to be updated at least once. To avoid an overload on the
system during a frequent operation such as updating, one needs to design an optimal update code, which updates exactly one
element in each parity node when an element changes its value. For example, in Figure 1 the shortened code restricted to nodes
{N3, N4, Parity 1, Parity 2} is an optimal update and optimal bandwidth (4,2,2) MDS code, because updating any of the
elements c, d, y, z will require updating exactly one element in each of the parity nodes.
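
The lower bound (1) is easy to evaluate numerically. The following is a minimal sketch, not part of the original paper; the function names are illustrative assumptions. It computes the bound with exact rational arithmetic and the average share that each of the $n-1$ surviving nodes must contribute.

```python
from fractions import Fraction

def repair_bandwidth_lower_bound(file_size, n, k):
    """Lower bound (1): (M / k) * (n - 1) / (n - k), in field symbols."""
    return Fraction(file_size, k) * Fraction(n - 1, n - k)

def average_share_per_node(file_size, n, k):
    """Average number of symbols each of the n - 1 surviving nodes sends."""
    return repair_bandwidth_lower_bound(file_size, n, k) / (n - 1)

# The (6,4,2) code of Figure 1 stores a file of M = k * l = 8 symbols.
bound = repair_bandwidth_lower_bound(8, 6, 4)
share = average_share_per_node(8, 6, 4)
```

For the (6,4,2) parameters this gives 5 symbols in total, i.e., one symbol from each of the five surviving nodes, which matches the repair strategy described for Figure 1.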

Various codes [5], [?], [12]-[14], [16], [21]-[23] were constructed with the goal of achieving optimal bandwidth, however
these constructions all have low rate, i.e., $k/n \le 1/2$. In [14], [16], [22] the key idea was using vector coding. Namely, each
symbol in a codeword is a vector and not a scalar as in "standard" codes. Specifically, [13], [16] constructed optimal bandwidth
$(2k,k,k)$ MDS codes. Using interference alignment, it was shown in [6] that the bound in (1) is asymptotically achievable
also for high rate codes ($k/n > 1/2$). The question of existence of optimal bandwidth codes with high rate was resolved
in several constructions [3], [4], [10], [11], [17]-[19]. The constructions have an arbitrary number of parity nodes $r$; however,
when $r$ is constant, i.e., the rate approaches 1, in all of the constructions $k = O(\log_r l)$, i.e., the capacity $l$ scales exponentially
with the number of systematic nodes $k$.

B. Our Contribution 

Our main goal in this paper is to understand the relation between $l$, the capacity of each node, and the number of systematic
nodes $k$. More precisely, given the capacity of the node $l$, what is the largest number of systematic nodes $k$ such that there
exists an optimal bandwidth or optimal access $(k+r,k,l)$ MDS code, for some constant $r$? We will derive three upper bounds
on the number of nodes $k$ as a function of only $l$, for different families of codes. We emphasize that we consider only linear
codes, and the bounds apply for this case only. To derive the bounds, we use three different combinatorial techniques. The first
bound considers the general problem, where no requirements on the MDS code are imposed except the optimal bandwidth
property. The bound is derived by defining an appropriate set of multivariate polynomials. We proceed by deriving a tight bound
for optimal bandwidth MDS codes with diagonal encoding matrices. These codes are a part of an important family of codes
with an optimal update property. The last result provides a tight bound on optimal access MDS codes. Table 2 summarizes
the known results together with our new results.

For constant $r$, all the previous optimal-bandwidth constructions [3], [4], [10], [11], [17]-[19] are indeed either optimal-access
codes or equivalent to optimal-access codes. Therefore, it is not obvious whether there can be any difference between
these two kinds of optimality. From the second row of Table 2 we see that for fixed $l$ and $r$, the maximum possible
number of systematic nodes is not the same for an optimal-bandwidth and an optimal-access code. That is to say, these two
criteria of optimality are not equivalent when a code is not optimal update.

2 The result we present considers a special case of optimal update code, where the encoding matrices are diagonal.

An example of the size of a practical code can be as follows. With today's technology the size of an ordinary disk in
large storage systems is approximately 1TB = $2^{40}$ bits. Hence, each node stores at most $2^{40}$ symbols. Applying for example
the upper bound in the table for optimal access codes (with $r = 2$), we get that there are at most $2 \cdot \log_2 2^{40} = 80$ nodes in the system.
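
The arithmetic of this example can be reproduced with a few lines of code. The sketch below is illustrative only (the helper names are assumptions, not from the paper); it evaluates the two tight bounds stated later in the paper, $k = r \log_r l$ for optimal access codes and $k = \log_r l$ for codes with diagonal encoding matrices, under the paper's standing assumption that $l$ is a power of $r$.

```python
def int_log(l, r):
    """Return t with r**t == l; raise if l is not a power of r."""
    t, x = 0, 1
    while x < l:
        x *= r
        t += 1
    if x != l:
        raise ValueError("l must be a power of r")
    return t

def max_k_optimal_access(l, r):
    """Tight bound for optimal access codes: k = r * log_r(l)."""
    return r * int_log(l, r)

def max_k_diagonal(l, r):
    """Tight bound for diagonal encoding matrices: k = log_r(l)."""
    return int_log(l, r)

nodes = max_k_optimal_access(2**40, 2)  # the 1TB example with r = 2
```

With $l = 2^{40}$ and $r = 2$ the first function returns 80, the figure quoted above.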

The remainder of the paper is organized as follows. Section II presents the settings of the problem and some notation. Section
III provides an upper bound for the most general case, i.e., an MDS code with the optimal bandwidth property. We proceed in
Section IV, where a bound is derived for codes with diagonal encoding matrices. In Section V a bound for codes with the optimal
access property is derived. We conclude with a summary in Section VI.



II. Settings and Notation 

Consider a file of size $\mathcal{M} = kl$, divided into $k$ nodes of capacity $l$ over the field $\mathbb{F}$, namely each node can store up to $l$
elements of that field. Each systematic node $i$, $1 \le i \le k$, is represented by an $l \times 1$ vector $a_i \in \mathbb{F}^l$. Interchangeably, we will refer
to a matrix $S$ and the subspace spanned by its rows as the same mathematical object, therefore

\[
\operatorname{rank}(S) = \dim(S).
\]

Moreover, whenever we write an equality between two matrices we mean an equality between the subspaces spanned by
their rows. For any integer $r$, a $(k+r,k,l)$ MDS code is constructed by adding parity nodes $k+1,\dots,k+r$, which give
the resiliency to node erasures. Parity node $k+i$, for $i \in \{1,\dots,r\}$, stores an information vector of length $l$ over $\mathbb{F}$, and
is defined as

\[
a_{k+i} = \sum_{j=1}^{k} C_{i,j} a_j.
\]


Here the $C_{i,j}$'s are invertible matrices of order $l$, which are called the encoding matrices. Note that the code has a systematic
structure, i.e., the first $k$ nodes store the information itself, and not a function of it. Therefore, the code is uniquely defined by
the matrix

\[
C = (C_{i,j})_{i \in [r],\, j \in [k]} =
\begin{pmatrix}
C_{1,1} & \cdots & C_{1,k} \\
\vdots & \ddots & \vdots \\
C_{r,1} & \cdots & C_{r,k}
\end{pmatrix}. \tag{2}
\]



The code is called MDS if it can repair any $r$ node erasures, which is equivalent to the statement that any $1 \times 1$, $2 \times 2$, ..., $r \times r$
block submatrix in (2) is invertible. Consider a scenario of a single erasure of a systematic node $m$, $1 \le m \le k$. In order to
optimally repair the lost data, a linear combination of the information stored in the parity nodes is transmitted to the erased
node. Namely, parity nodes $k+1,\dots,k+r$ project their data on the repairing subspaces $S_{1,m}, S_{2,m}, \dots, S_{r,m}$, of dimension $l/r$
each, respectively. During the repair process of systematic node $m \in [k]$, parity node $k+i$ transmits the information

\[
S_{i,m} a_{k+i} = \sum_{j=1}^{k} S_{i,m} C_{i,j} a_j.
\]

The only information about the lost systematic node $m$ received from parity node $k+i$ is $S_{i,m} C_{i,m} a_m$. Note that the other surviving
systematic nodes do not contain any information about the lost node. Therefore a necessary condition for repairing the lost
information of systematic node $m$ is

\[
\operatorname{rank}
\begin{pmatrix}
S_{1,m} C_{1,m} \\
\vdots \\
S_{r,m} C_{r,m}
\end{pmatrix}
= l, \tag{3}
\]

i.e., the matrix is invertible. This condition is equivalent to the subspaces $S_{1,m} C_{1,m}, \dots, S_{r,m} C_{r,m}$ forming a direct sum of $\mathbb{F}^l$,
namely

\[
\oplus_{i \in [r]} S_{i,m} C_{i,m} = \mathbb{F}^l. \tag{4}
\]



However, the transmitted information from the parities contains interference (information) from the other surviving nodes. The
interference of node $m' \ne m$ received from parity node $k+i$ is $S_{i,m} C_{i,m'} a_{m'}$. Systematic node $m'$ transmits to the repair center
enough information in order to cancel out this interference. In total, the information that needs to be transmitted from node
$m'$ is

\[
\begin{pmatrix}
S_{1,m} C_{1,m'} \\
\vdots \\
S_{r,m} C_{r,m'}
\end{pmatrix} a_{m'}. \tag{5}
\]

Hence the amount of information transmitted is equal to the rank of the matrix in (5). The rank of the matrix $S_{1,m} C_{1,m'}$ is
$l/r$, therefore the rank of the whole matrix is at least $l/r$. Thus the code is optimal bandwidth only if we transmit the smallest
amount of information, i.e., for any $m' \ne m$

\[
\operatorname{rank}
\begin{pmatrix}
S_{1,m} C_{1,m'} \\
\vdots \\
S_{r,m} C_{r,m'}
\end{pmatrix}
= \frac{l}{r}, \tag{6}
\]

which is equivalent to the equality between the subspaces

\[
S_{1,m} C_{1,m'} = S_{2,m} C_{2,m'} = \cdots = S_{r,m} C_{r,m'}. \tag{7}
\]
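
To make the repair process concrete, the following is a small sketch on a hypothetical (4,2,2) toy code over $\mathbb{F}_3$. The encoding matrices below are an illustrative assumption chosen so that conditions (3) and (6) hold with the repairing subspaces span$(e_1)$ for node 1 and span$(e_2)$ for node 2; they are not the code of Figure 1. Each surviving node sends a single symbol, i.e., $l/r = 1$.

```python
P = 3  # prime field size (illustrative choice)

# Hypothetical encoding matrices; the second parity uses identity matrices.
C11 = [[1, 1], [0, 1]]
C12 = [[2, 0], [1, 2]]

def matvec(M, v):
    """Matrix-vector product over F_P."""
    return [sum(m * x for m, x in zip(row, v)) % P for row in M]

def add(u, v):
    return [(x + y) % P for x, y in zip(u, v)]

def encode(a1, a2):
    """Parity nodes a_3 = C11 a1 + C12 a2 and a_4 = a1 + a2."""
    return add(matvec(C11, a1), matvec(C12, a2)), add(a1, a2)

def repair_node1(a2, p1, p2):
    """Rebuild a1 from the first coordinate of each surviving node."""
    t_sys, t_p1, t_p2 = a2[0], p1[0], p2[0]   # three transmitted symbols
    x0 = (t_p2 - t_sys) % P                   # a1[0]
    x1 = (t_p1 - 2 * t_sys - x0) % P          # a1[1], interference cancelled
    return [x0, x1]

def repair_node2(a1, p1, p2):
    """Rebuild a2 from the second coordinate of each surviving node."""
    t_sys, t_p1, t_p2 = a1[1], p1[1], p2[1]
    y1 = (t_p2 - t_sys) % P                   # a2[1]
    y0 = (t_p1 - t_sys - 2 * y1) % P          # a2[0]
    return [y0, y1]
```

In both repair routines the systematic survivor sends exactly the symbol needed to cancel its interference in the two parity symbols, which is what condition (7) guarantees.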



We conclude that an optimal bandwidth algorithm for the systematic nodes is defined by the set of repairing subspaces
$(S_{1,m}, \dots, S_{r,m})$ that satisfy (4) and (7) for $1 \le m \le k$.$^3$ However, it will be more convenient to assume that the repairing
subspaces are constant, namely to repair systematic node $m$ we use the same repairing subspace $S_m$ for each of the $r$ parities.
In other words, the information transmitted from parity node $k+i$ is $S_m a_{k+i}$. Combining equations (3) and (6) we get the
following corollary.

Corollary 1 The code defined in (2) is optimal bandwidth with constant repairing subspaces if there exist subspaces $S_1, \dots, S_k$,
each of dimension $l/r$, such that for any $m, m' \in [k]$

\[
\operatorname{rank}
\begin{pmatrix}
S_m C_{1,m'} \\
\vdots \\
S_m C_{r,m'}
\end{pmatrix}
=
\begin{cases}
l, & m' = m, \\
l/r, & \text{otherwise.}
\end{cases} \tag{8}
\]
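
Condition (8) can be checked mechanically for a small code. The sketch below does so for a hypothetical (4,2,2) toy code over $\mathbb{F}_3$; the matrices, the subspaces and all function names are illustrative assumptions, not taken from the paper. Ranks are computed by Gaussian elimination modulo a prime, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
P = 3  # prime field size (illustrative choice)

def rank_mod_p(rows, p=P):
    """Rank of a matrix over F_p via Gauss-Jordan elimination."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)          # modular inverse of the pivot
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(x - f * y) % p for x, y in zip(m[i], m[rank])]
        rank += 1
    return rank

def matmul(A, B, p=P):
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)]
            for row in A]

I2 = [[1, 0], [0, 1]]
# C[i][j] plays the role of C_{i+1,j+1}; the last block row is the identity.
C = [[[[1, 1], [0, 1]], [[2, 0], [1, 2]]],
     [I2, I2]]
# Constant repairing subspaces S_1 = span(e_1), S_2 = span(e_2), as 1 x 2 matrices.
S = [[[1, 0]], [[0, 1]]]

def condition_8_rank(m, m_prime):
    """Rank of the stacked matrix (S_m C_{1,m'}; ...; S_m C_{r,m'})."""
    stacked = []
    for i in range(len(C)):
        stacked.extend(matmul(S[m], C[i][m_prime]))
    return rank_mod_p(stacked)
```

For this toy code the rank is $l = 2$ when $m' = m$ and $l/r = 1$ otherwise, so the two extreme cases of (8) are both visible.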



The following remarks apply for codes with constant repairing subspaces.
Remarks:

1) Without loss of generality we will always assume that the last row in the encoding matrix $C$ in (2) is composed of
only identity matrices, i.e., $C_{r,m} = I$ for any $m \in [k]$. Indeed, if $C = (C_{i,j})$, $i \in [r]$, $j \in [k]$, defines an optimal bandwidth
code, let $C'_{i,j} = C_{i,j} C_{r,j}^{-1}$. Then $C' = (C'_{i,j})$, $i \in [r]$, $j \in [k]$, with the same sets of repairing subspaces, defines an optimal
bandwidth code, and $C'_{r,m}$ is the identity matrix for any $m \in [k]$.

2) Since the dimension of each subspace $S_m$ is $l/r$, and any encoding matrix $C \in \{C_{i,j}\}$ is invertible, $\dim(S_m C) = l/r$.
Hence the rank of the matrix in (8), which is composed of $r$ block matrices, has two extreme cases for its possible value.
For $m = m'$ the rank is maximal, i.e., the matrix is invertible. For $m \ne m'$ the rank has the minimum possible value of
$l/r$. Note also that in this case, for any $i \in [r]$

\[
S_m C_{i,m'} = S_m. \tag{9}
\]

Namely, $S_m$ is an invariant subspace for any matrix $C_{i,m'}$ when $m' \ne m$. This follows since $C_{r,m'}$ is assumed to be the
identity matrix according to the previous remark.

3) For $m' = m$, (8) is equivalent to

\[
\oplus_{i \in [r]} S_m C_{i,m} = \mathbb{F}^l. \tag{10}
\]
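
The normalization in the first remark can be spelled out in one line. The following worked step is a sketch of the standard argument under the notation above, with $C'_{i,j} = C_{i,j} C_{r,j}^{-1}$:

```latex
\[
\begin{pmatrix}
S_m C'_{1,m'} \\ \vdots \\ S_m C'_{r,m'}
\end{pmatrix}
=
\begin{pmatrix}
S_m C_{1,m'} \\ \vdots \\ S_m C_{r,m'}
\end{pmatrix}
C_{r,m'}^{-1},
\qquad\text{hence}\qquad
\operatorname{rank}
\begin{pmatrix}
S_m C'_{1,m'} \\ \vdots \\ S_m C'_{r,m'}
\end{pmatrix}
=
\operatorname{rank}
\begin{pmatrix}
S_m C_{1,m'} \\ \vdots \\ S_m C_{r,m'}
\end{pmatrix}.
\]
```

Right multiplication by the invertible matrix $C_{r,m'}^{-1}$ preserves rank, so (8) holds for $C'$ with the same subspaces $S_1, \dots, S_k$. Likewise, every block submatrix of $C'$ is the corresponding block submatrix of $C$ multiplied on the right by an invertible block-diagonal matrix, so the MDS property is preserved as well.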



The next theorem shows that from any optimal bandwidth MDS code we can construct another optimal bandwidth MDS 
code with constant repairing subspaces, and almost the same parameters. 

Theorem 2 If there exists an optimal bandwidth $(k+r,k,l)$ MDS code, then there exists an optimal bandwidth $(k+r-1,k-1,l)$
MDS code with constant repairing subspaces.

The proof is shown in Appendix A.

From the last theorem we get the following corollary.

Corollary 3 Let $k$ be the largest number of systematic nodes in an optimal bandwidth $(k+r,k,l)$ MDS code. Let $s$ be the largest
number of systematic nodes in an optimal bandwidth $(s+r,s,l)$ MDS code with constant repairing subspaces, then $s \le k \le s+1$.

Proof: It is clear that $s \le k$. From Theorem 2 we conclude that $k-1 \le s$. ■

Theorem 2 shows that the difference between the maximum number of nodes $k$ in optimal bandwidth MDS codes with
or without constant repairing subspaces is negligible (at most 1). Therefore in the sequel we will always assume that the codes
have constant repairing subspaces, and the bounds will apply for this case.

For any two integers $i \le j$ denote $[i] = \{1,\dots,i\}$ and $[i,j] = \{i,i+1,\dots,j\}$. For simplicity, we will assume that the
capacity of each node, $l$, is a power of $r$. In the next section we present our first bound, which applies for the most general
case.

3 We point out that similar conditions were derived also in [14].

III. Upper bound on the number of nodes in an optimal bandwidth MDS code

We start with the most general problem, which seems to be the most difficult. No constraints on the encoding matrices and
the repairing subspaces are imposed. We derive an upper bound on the number of information nodes $k$ in an optimal bandwidth
$(k+r,k,l)$ MDS code for an arbitrary number of parities $r$. The bound is a function of only the capacity $l$ of the node, regardless
of the field size being used.

Before we prove the upper bound, for sets of indices $I, J$ define $B_{I,J}$ to be the submatrix of $B$ restricted to rows $I$ and
columns $J$.

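Theorem 4 Let $C = (C_{i,j})$ be a $(k+r,k,l)$ optimal bandwidth MDS code with constant repairing subspaces $S_1,\dots,S_k$, then

\[
k \le l \binom{l}{l/r}.
\]

Proof: By the optimal bandwidth property, for any $m \in [k]$ the matrix

\[
\begin{pmatrix}
S_m C_{1,m} \\
\vdots \\
S_m C_{r,m}
\end{pmatrix} \tag{11}
\]

is of full rank. Here $S_m$ is a matrix of dimension $\frac{l}{r} \times l$. Hence there exists a set of indices $J \subseteq [l]$ of size $\frac{l}{r}+1$ such that the
$(\frac{l}{r}+1) \times (\frac{l}{r}+1)$ submatrix restricted to rows $[l(r-1)/r,\, l]$ and columns $J$ is invertible. Namely,

\[
\det
\begin{pmatrix}
S_m C_{1,m} \\
\vdots \\
S_m C_{r,m}
\end{pmatrix}_{[\frac{l(r-1)}{r},\, l],\, J}
\ne 0.
\]

Moreover, since for any $m' \ne m$,

\[
\operatorname{rank}
\begin{pmatrix}
S_{m'} C_{1,m} \\
\vdots \\
S_{m'} C_{r,m}
\end{pmatrix}
= \frac{l}{r},
\]

the submatrix restricted to the same set of rows and columns is not of full rank (note that for distinct $m$'s the set of indices
$J$ might be different). Hence, for each $m \in [k]$ the polynomial $f_m : \mathbb{F}^{\frac{l}{r} \times l} \to \mathbb{F}$, defined by

\[
f_m(S) = \det
\begin{pmatrix}
S C_{1,m} \\
\vdots \\
S C_{r,m}
\end{pmatrix}_{[\frac{l(r-1)}{r},\, l],\, J}, \tag{12}
\]

satisfies

\[
f_m(S_{m'})
\begin{cases}
\ne 0, & m = m', \\
= 0, & \text{otherwise.}
\end{cases} \tag{13}
\]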



We claim that the $f_m$'s are linearly independent multivariate polynomials. Assume that for some $\alpha_m$'s in $\mathbb{F}$

\[
\sum_{m} \alpha_m f_m = 0,
\]

where $0$ is the zero polynomial. Assume by contradiction that $\alpha_j \ne 0$ for some $j$, but then

\[
0 = 0(S_j) = \sum_{m} \alpha_m f_m(S_j) = \alpha_j f_j(S_j) \ne 0,
\]

and we get a contradiction. Therefore the polynomials are linearly independent. Let $X = (x_{i,j})$ be the $\frac{l}{r} \times l$ matrix of
indeterminates, and define two sets of polynomials

\[
T_1 = \left\{ \det\left( X_{[l/r],\, I} \right) : I \in \binom{[l]}{l/r} \right\}
\]

and $T_2 = \{x_{l/r,\,i} : 1 \le i \le l\}$, where $\binom{[l]}{l/r}$ is the set of $l/r$-subsets of $[l]$. Note that each element in the $l(r-1)/r$-th row
of (11) is a linear combination of the indeterminates $x_{l/r,1},\dots,x_{l/r,l}$ in the last row. In addition, recall that $C_{r,m}$ is the identity
matrix and $S_m C_{r,m} = S_m$. Hence, by expanding the determinant in (12) by the $l(r-1)/r$-th row, we conclude that it is a
linear combination of the polynomials from

\[
T_1 \cdot T_2 = \{h \cdot g : h \in T_1,\, g \in T_2\}.
\]

Namely, $\{f_m\} \subseteq \operatorname{span}(T_1 \cdot T_2)$. However, since the $f_m$'s are linearly independent, the number of polynomials is at most the
dimension, i.e.,

\[
k = |\{f_m\}| \le \dim(\operatorname{span}(T_1 \cdot T_2)) \le |T_1| \cdot |T_2| = l \binom{l}{l/r}.
\]

■



Corollary 5 Let $k$ be the largest number of systematic nodes in an optimal bandwidth $(k+r,k,l)$ MDS code, then

\[
(r+1)\log_r l \le k \le l \binom{l}{l/r}.
\]

Proof: The lower bound is given by the code constructed in [20]. ■

As one can notice, there exists a big gap between the upper and the lower bound. We conjecture that the lower bound is
more accurate, and in fact $k = \Theta(\log l)$.
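
The size of this gap is easy to see numerically. The sketch below is illustrative only (the function names are assumptions); it evaluates the general upper bound $l\binom{l}{l/r}$ of Theorem 4 next to the lower bound $(r+1)\log_r l$, for $l$ a power of $r$.

```python
from math import comb

def int_log(l, r):
    """Return t with r**t == l; raise if l is not a power of r."""
    t, x = 0, 1
    while x < l:
        x *= r
        t += 1
    if x != l:
        raise ValueError("l must be a power of r")
    return t

def general_upper_bound(l, r):
    """Upper bound of Theorem 4: l * binom(l, l/r)."""
    return l * comb(l, l // r)

def known_lower_bound(l, r):
    """Lower bound quoted in Corollary 5: (r + 1) * log_r(l)."""
    return (r + 1) * int_log(l, r)

for l in (4, 16, 64):
    print(l, known_lower_bound(l, 2), general_upper_bound(l, 2))
```

Already for $l = 16$ and $r = 2$ the lower bound is 12 while the upper bound is 205920, which illustrates why the conjecture $k = \Theta(\log l)$ would be a substantial improvement.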

We proceed by giving a tight bound for the number of systematic nodes k in the case where all the encoding matrices are 
diagonal. 

IV. Upper bound for Diagonal Encoding Matrices 

One of the most common operations in the maintenance of a storage system is updating. Namely, a certain element has
changed its value, and that needs to be updated in the system. Since the code is MDS, each parity node is a function of
the entire information stored in the system. Therefore, in a single update, each parity node needs to be updated in at least one
of the elements it stores. An optimal update code is one that needs to update each parity node exactly once in an update of
any information element. Namely, an optimal update code performs the minimum number of updates for any value change. Since
updating is a highly frequent operation, a storage system with the optimal update property has a huge advantage. A reasonable
question to ask is what can be said about systems that possess both the optimal access/bandwidth and optimal update properties.
In this section we derive a tight bound on the number of information disks for these systems. However, the derived bound
applies only for a special case of an optimal update code, where all the encoding matrices are diagonal. Note that in Theorem
2, if the code is composed of diagonal encoding matrices, then the constructed code with constant repairing
subspaces will also be composed of diagonal matrices. Therefore Corollary 3 applies also to codes with diagonal matrices.

We begin with a simple lemma on the entropy function. 

Lemma 6 Let $X$ be a random variable such that for any possible outcome $x$, $P(X = x) \le \frac{1}{r}$, then its entropy satisfies $H_r(X) \ge 1$,
where $H_r(\cdot)$ is the entropy function calculated in base $r$.

Proof: Since $P(X = x) \le \frac{1}{r}$ for every outcome $x$, we have $\log_r \frac{1}{P(X = x)} \ge 1$ and

\[
H_r(X) = E\left(\log_r \frac{1}{P(X)}\right) \ge 1.
\]

■
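
Lemma 6 can be sanity-checked on small distributions. The following sketch is illustrative (the function name is an assumption); it computes the base-$r$ entropy directly from the definition used in the proof.

```python
import math

def entropy_base_r(probs, r):
    """H_r(X) = sum_x P(x) * log_r(1 / P(x)), skipping zero-probability outcomes."""
    return sum(p * math.log(1.0 / p, r) for p in probs if p > 0)

# Every outcome has probability at most 1/r, so the lemma predicts H_r >= 1.
uniform_binary = entropy_base_r([0.5, 0.5], 2)
spread_ternary = entropy_base_r([1/3, 1/3, 1/6, 1/6], 3)
```

The uniform distribution on $r$ outcomes attains the bound with equality, while spreading the mass over more outcomes only increases the entropy.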

Next we make a few definitions. A partition $\mathcal{X}$ of some set $T$ is a set of subsets of $T$ such that

\[
\cup_{x \in \mathcal{X}}\, x = T,
\]

and for any distinct sets $x_1, x_2 \in \mathcal{X}$

\[
x_1 \cap x_2 = \emptyset.
\]

Moreover, for two partitions $\mathcal{X}, \mathcal{Y}$, their meet is defined as

\[
\mathcal{X} \wedge \mathcal{Y} = \{x \cap y : x \in \mathcal{X},\, y \in \mathcal{Y}\}.
\]

Note that the meet of two partitions of the same set is also a partition. We denote partitions by calligraphic letters $\mathcal{X}, \mathcal{Y}$, and
sets in a partition by lowercase letters, e.g., $x \in \mathcal{X}$. For a set of indices $x \subseteq [l]$ denote $\operatorname{span}(e_x) = \operatorname{span}(e_i : i \in x)$, where
$e_i$ is the $i$-th vector in the standard basis.

Since each encoding matrix $C_{i,j}$ is diagonal, the standard basis vectors are its set of eigenvectors, and the entries along the
diagonal are its eigenvalues. Therefore $C_{i,j}$ defines a partition $\mathcal{X}_{i,j}$ of $[l]$, where $m, n \in [l]$ are in the same set of the partition iff
the corresponding standard basis vectors $e_m$ and $e_n$ have the same eigenvalue in $C_{i,j}$. Let $m' \in [k]$ be some node that needs to
be repaired, and denote by $\mathcal{X}$ the meet of the partitions

\[
\mathcal{X} = \wedge_{i \in [r],\, m \ne m'}\, \mathcal{X}_{i,m}.
\]

In addition, let $S = S_{m'}$ be the repair subspace for that node.
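
The partition induced by a diagonal matrix and the meet of partitions are simple to compute. The sketch below is illustrative (function names and the sample diagonals are assumptions); indices are 0-based, and empty intersections are dropped from the meet so that the result is again a partition into nonempty sets.

```python
def partition_from_diagonal(diag):
    """Group indices whose diagonal entries (eigenvalues) coincide."""
    groups = {}
    for idx, val in enumerate(diag):
        groups.setdefault(val, set()).add(idx)
    return {frozenset(g) for g in groups.values()}

def meet(X, Y):
    """Meet of two partitions: all nonempty pairwise intersections."""
    return {x & y for x in X for y in Y if x & y}

# Two hypothetical diagonal encoding matrices of order l = 4.
X1 = partition_from_diagonal([1, 1, 2, 2])   # {{0, 1}, {2, 3}}
X2 = partition_from_diagonal([1, 2, 1, 2])   # {{0, 2}, {1, 3}}
X = meet(X1, X2)                             # partition into singletons
```

Each additional diagonal matrix can only refine the meet, which is the mechanism the following lemmas exploit to bound $k$.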

The following lemma shows that $S$ can be decomposed into a direct sum of subspaces, such that each subspace is an invariant
subspace of all the matrices $C_{i,m}$, $i \in [r]$, $m \ne m'$. Note that for each $x \in \mathcal{X}$ and $m \ne m'$, the subspace $\operatorname{span}(e_x)$ is a subspace
of some eigenspace of $C_{i,m}$. Therefore, $\operatorname{span}(e_x)$ and $S \cap \operatorname{span}(e_x)$ are invariant subspaces of $C_{i,m}$.

Lemma 7 The repair subspace $S$ of the node $m'$ can be written as

\[
S = \oplus_{x \in \mathcal{X}} S_x, \tag{14}
\]

where $S_x = S \cap \operatorname{span}(e_x)$.

Proof: It is clear that a vector $v \ne 0$ is an eigenvector for all the matrices $C_{i,m}$, $m \ne m'$, iff $v \in \operatorname{span}(e_x)$ for some set $x$
in the partition $\mathcal{X}$. Assume $S$ is represented in its reduced row echelon form, and without loss of generality we assume that
the first $l/r$ columns of $S$ are linearly independent, hence

\[
S = \left( I_{l/r} \mid A \right).
\]

Here $I_{l/r}$ is the identity matrix of order $l/r$ and $A$ is an $l/r \times l(r-1)/r$ matrix, and recall that $S$ is an $l/r \times l$ matrix. For any
$j \in [l/r]$ let $v_j = (e_j \mid a_j)$ be the $j$-th row of $S$, where $a_j$ is the $j$-th row of $A$. By the optimal bandwidth property, $S$ is an
invariant subspace of any matrix $C_{i,m}$ for any $m \ne m'$ and $i \in [r]$, which are all diagonal matrices. Therefore, we get

\[
v_j C_{i,m} = (\alpha e_j \mid a'_j) \in S = \operatorname{span}(v_1,\dots,v_{l/r}),
\]

for some nonzero $\alpha \in \mathbb{F}$ and a vector $a'_j$. Namely,

\[
\operatorname{rank}\begin{pmatrix} S \\ v_j C_{i,m} \end{pmatrix}
= \operatorname{rank}\begin{pmatrix} I_{l/r} & A \\ \alpha e_j & a'_j \end{pmatrix}
= l/r.
\]

We claim that $a'_j = \alpha a_j$, namely $(e_j \mid a_j)$, the $j$-th row of $S$, is an eigenvector of $C_{i,m}$. This follows since $v_j, v_j C_{i,m} \in S$ and

\[
\alpha v_j - v_j C_{i,m} = \alpha (e_j \mid a_j) - (\alpha e_j \mid a'_j) = (0 \mid \alpha a_j - a'_j) \in S.
\]

However, the only vector in $S$ with the first $l/r$ entries being zero is the zero vector. Hence we conclude that $a'_j = \alpha a_j$, and each
row vector $v_j$ of $S$ is an eigenvector of $C_{i,m}$ for any $m \ne m'$. Namely, $v_j \in \operatorname{span}(e_x)$ for some set $x$ in the partition $\mathcal{X}$, and
the result follows. ■
So far we have looked at $\mathcal{X}$, the meet of the partitions $\mathcal{X}_{i,m}$, $i \in [r]$, $m \ne m'$. Next, we are going to partition each set in $\mathcal{X}$
using the partitions $\mathcal{X}_{i,m'}$, $i \in [r]$, and then upper bound the size of each set in that partition.

Lemma 8 For $x \in \mathcal{X}$ denote by $\mathcal{P}_x = x \wedge (\wedge_{i \in [r]} \mathcal{X}_{i,m'})$ the partition of $x$ by $\mathcal{X}_{i,m'}$, $1 \le i \le r$. Then the size of each set in the
partition $\mathcal{P}_x$ is at most $|x|/r$, namely

\[
\max_{z \in \mathcal{P}_x} |z| \le \frac{|x|}{r}. \tag{15}
\]

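Proof: Assume to the contrary that the size of some set $z$ in $\mathcal{P}_x$ is $|z| > |x|/r$. On one hand, for each $x \in \mathcal{X}$ the subspace
$S_x$ is contained in $\operatorname{span}(e_x)$; moreover, $\operatorname{span}(e_x)$ is an invariant subspace for $C_{i,m'}$ for any $i \in [r]$, since it is a diagonal matrix.
Therefore

\[
S_x C_{i,m'} \subseteq \operatorname{span}(e_x) C_{i,m'} = \operatorname{span}(e_x). \tag{16}
\]

In addition

\begin{align}
\mathbb{F}^l &= \oplus_{i \in [r]} S C_{i,m'} \tag{17} \\
&= \oplus_{i \in [r]} \oplus_{x \in \mathcal{X}} S_x C_{i,m'} \tag{18} \\
&= \oplus_{x \in \mathcal{X}} \oplus_{i \in [r]} S_x C_{i,m'}. \tag{19}
\end{align}

Here (17) follows from (10) and (18) follows from (14). From (16) and (19) we conclude that for any $x \in \mathcal{X}$

\[
\oplus_{i \in [r]} S_x C_{i,m'} = \operatorname{span}(e_x). \tag{20}
\]

Calculating the dimensions in (20),

\begin{align*}
|x| &= \dim(\operatorname{span}(e_x)) \\
&= \dim(\oplus_{i \in [r]} S_x C_{i,m'}) \\
&= \sum_{i=1}^{r} \dim(S_x C_{i,m'}) \\
&= r \dim(S_x),
\end{align*}

i.e.,

\[
\dim(S_x) = \frac{|x|}{r}. \tag{21}
\]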

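On the other hand, let $\alpha_i$ be the eigenvalue of the matrix $C_{i,m'}$ that corresponds to the vectors in $\operatorname{span}(e_z)$. W.l.o.g. assume
that $z = \{1,2,\dots,|z|\}$, hence by (20)

\[
|x| = \operatorname{rank}
\begin{pmatrix}
S_x C_{1,m'} \\
\vdots \\
S_x C_{r,m'}
\end{pmatrix}
= \operatorname{rank}
\begin{pmatrix}
S_x (C_{1,m'} - \alpha_1 I) \\
\vdots \\
S_x (C_{r-1,m'} - \alpha_{r-1} I) \\
S_x
\end{pmatrix}. \tag{22}
\]

Here the last equality in (22) follows since $C_{r,m'}$ is the identity matrix, and the two matrices are row equivalent. However, for
any $i \in [r]$, the first $|z|$ columns in the diagonal matrix

\[
C_{i,m'} - \alpha_i I
\]

are zeros. In addition, $S_x$ is contained in $\operatorname{span}(e_x)$, i.e., the indices of the nonzero entries in any vector of $S_x$ are contained
in $x$. Therefore we get that for any $i$,

\[
S_x (C_{i,m'} - \alpha_i I) \subseteq \operatorname{span}(e_{x \setminus z}).
\]

Hence

\begin{align}
\operatorname{rank}
\begin{pmatrix}
S_x (C_{1,m'} - \alpha_1 I) \\
\vdots \\
S_x (C_{r-1,m'} - \alpha_{r-1} I)
\end{pmatrix}
&\le \dim(\operatorname{span}(e_{x \setminus z})) \notag \\
&= |x| - |z| \notag \\
&< |x| - \frac{|x|}{r}. \tag{23}
\end{align}

Therefore we have

\begin{align}
|x| &= \operatorname{rank}
\begin{pmatrix}
S_x C_{1,m'} \\
\vdots \\
S_x C_{r,m'}
\end{pmatrix} \notag \\
&\le \operatorname{rank}
\begin{pmatrix}
S_x (C_{1,m'} - \alpha_1 I) \\
\vdots \\
S_x (C_{r-1,m'} - \alpha_{r-1} I)
\end{pmatrix}
+ \operatorname{rank}(S_x) \notag \\
&< |x| - \frac{|x|}{r} + \frac{|x|}{r} \tag{24} \\
&= |x|. \notag
\end{align}

Here (24) follows from (23) and (21). This is a contradiction, therefore (15) holds. ■
Now we are ready to prove the upper bound on the number of systematic nodes.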
Now we are ready to prove the upper bound on the number of systematic nodes. 

Theorem 9 Let $C = (C_{i,j})$ be a $(k+r,k,l)$ optimal bandwidth code composed of diagonal encoding matrices, namely each $C_{i,j}$
is a diagonal matrix, and constant repairing subspaces $S_1,\dots,S_k$, then $k \le \log_r l$.

Proof: Let $J$ be a random variable that takes any integer value $1,2,\dots,l$ with equal probability. Define for $m' \in [k]$ the
random variable $Y_{m'}$ to be the set $z$ in the partition $\wedge_{i \in [r]} \mathcal{X}_{i,m'}$ that contains $J$. By (15) we conclude that

\[
P(Y_{m'} = z \mid Y_m = y_m,\, m \in [k] \setminus \{m'\}) \le \frac{1}{r},
\]

for any values of $y_m$, $m \in [k] \setminus \{m'\}$. Hence from Lemma 6 we conclude that the conditional entropy of $Y_{m'}$ satisfies

\[
H_r(Y_{m'} \mid Y_m,\, m \in [k] \setminus \{m'\}) \ge 1. \tag{25}
\]

Therefore,

\begin{align}
\log_r l &= H_r(J) \notag \\
&= H_r(J, Y_1, \dots, Y_k) \notag \\
&= H_r(Y_1, \dots, Y_k) + H_r(J \mid Y_1, \dots, Y_k) \notag \\
&\ge H_r(Y_1, \dots, Y_k) \notag \\
&= \sum_{m=1}^{k} H_r(Y_m \mid Y_1, \dots, Y_{m-1}) \notag \\
&\ge \sum_{m'=1}^{k} H_r(Y_{m'} \mid Y_m,\, m \ne m') \tag{26} \\
&\ge \sum_{m'=1}^{k} 1 = k, \tag{27}
\end{align}

where (26) follows since conditioning reduces entropy, and (27) follows from (25). ■

Corollary 10 Let $k$ be the largest number of systematic nodes in an optimal bandwidth $(k+r,k,l)$ MDS code with diagonal
encoding matrices, then $k = \log_r l$.

Proof: The lower bound is given by the codes constructed in [?], [10], [11], [18]. ■

Note that when restricting to diagonal encoding matrices, there is no difference between optimal access and optimal
bandwidth codes in terms of the maximum code length $k$ (see Table 2). However, in the next section we show that these two properties
are not equivalent in the general case.



V. Upper Bound on the number of nodes for Optimal Access 

Storage systems with the optimal bandwidth MDS property introduce high efficiency in data transmission during a repair process.
However, a major bottleneck can still emerge if the transmitted information is a function of a large portion of the data stored
in each node. In the extreme case the information is a function of the entire information within the node. Namely, in order to
generate the transmitted data from some surviving node, one has to access and read all the information stored in that node,
which of course can be an expensive task. An optimal access code is an optimal bandwidth code that transmits only the
elements it accesses. Namely, the amount of information read is equal to the amount of information transmitted. The property
of optimal access is equivalent to each repairing subspace $S_i$ being spanned by an $l/r$-subset of the standard basis $e_1,\dots,e_l$,
i.e., $S_i = \operatorname{span}(e_m : m \in I)$ for some $l/r$-subset $I$ of $[l]$. As before, if the code in Theorem 2 is optimal access then the
constructed code in that theorem will also have the optimal access property. This follows since the set of repairing subspaces
for the newly constructed code is a subset of the repairing subspaces for the old code. Therefore Corollary 3 applies also to
optimal access codes.

We start with a useful lemma that shows that in an optimal access code with constant repairing subspaces, the intersections
between the subspaces are not large.

Lemma 11 LetCbean (k + r,k,l) optimal access code with constant repairing subspaces Si,...,Sk, then for any subset of indices 
TC[k] ^ 

dim(n feT S t ) ^ -j^p 

Proof: We prove the claim by induction on the size of $T$. For $|T| = 1$ there is nothing to prove. For $|T| = t$, w.l.o.g. assume that
$T = [t]$, and denote $S = \bigcap_{i \in [t]} S_i$. Assume to the contrary that $\dim(S) > l/r^t$. It is clear by definition that $S \subseteq S_j$ for any
$j \in [t-1]$, hence by (8), for any $i \in [r-1]$,

$$S C_{i,t} \subseteq \bigcap_{j \in [t-1]} S_j.$$

We conclude that $S C_{1,t},\dots,S C_{r,t}$ are $r$ subspaces of dimension greater than $l/r^t$, which are contained in the subspace $\bigcap_{j \in [t-1]} S_j$,
which by the induction hypothesis is of dimension at most $l/r^{t-1}$. Since $r \cdot l/r^t = l/r^{t-1}$, the dimensions of these subspaces sum to
more than the dimension of the space that contains them. Therefore the sum of these subspaces is not a direct sum. Since $S \subseteq S_t$,
this contradicts the direct-sum condition on $S_t C_{1,t},\dots,S_t C_{r,t}$. ■
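The bound of Lemma 11 can be checked numerically on a small family of coordinate subspaces. The sketch below is an assumption-laden illustration rather than a code construction: it indexes the $l = r^m$ standard basis vectors by tuples in $\{0,\dots,r-1\}^m$ and takes the hypothetical subspace $S_i$ to be spanned by the basis vectors whose $i$-th coordinate is zero. For subspaces spanned by standard basis vectors, the dimension of an intersection is simply the number of shared basis vectors.

```python
from itertools import product, combinations

def coordinate_subspaces(r, m):
    """Index the l = r**m basis vectors by tuples a in {0..r-1}^m.
    S_i is represented by the set of tuples with a[i] == 0 (hypothetical family)."""
    basis = list(product(range(r), repeat=m))
    return [frozenset(a for a in basis if a[i] == 0) for i in range(m)]

def intersection_dim(subspaces, T):
    """For coordinate subspaces, the dimension of an intersection is the
    number of basis vectors common to all of them."""
    common = set(subspaces[T[0]])
    for t in T[1:]:
        common &= subspaces[t]
    return len(common)

r, m = 3, 3
l = r ** m
S = coordinate_subspaces(r, m)
for size in range(1, m + 1):
    for T in combinations(range(m), size):
        d = intersection_dim(S, T)
        # Lemma 11 bound: dim <= l / r^|T|, checked in integer arithmetic
        assert d * r ** size <= l
        print(T, d, l // r ** size)
```

For this particular family every intersection has dimension exactly $l/r^{|T|}$, so the inequality of the lemma cannot be improved for coordinate subspaces of this shape.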

Corollary 12 Under the conditions of the previous lemma, the number of repairing subspaces $\{S_i\}_{i=1}^{k}$ that contain an arbitrary
vector $v \neq 0$ is at most $\log_r l$.

Proof: Let $J = \{i : v \in S_i\}$. Then

$$1 \le \dim\Big(\bigcap_{i \in J} S_i\Big) \le \frac{l}{r^{|J|}},$$

and the result follows. ■

The previous corollary shows that an arbitrary vector $v \neq 0$ cannot belong to "too many" repairing subspaces $S_i$. This
observation leads to a bound on the number of nodes in an optimal access code.

Theorem 13 Let $\mathcal{C}$ be an $(k+r,k,l)$ optimal access MDS code with constant repairing subspaces $S_1,\dots,S_k$. Then $k \le r \log_r l$.

Proof: Define a bipartite graph whose first set of vertices is the standard basis vectors $e_1,\dots,e_l$ and whose second set of vertices
is the repairing subspaces $S_1,\dots,S_k$. Define an edge between a vector $e_i$ and a subspace $S_j$ iff $S_j$ contains $e_i$. We count
the number of edges in the graph in two different ways. By assumption the code is optimal access, hence each repairing
subspace contains $l/r$ standard basis vectors, and the degree of each repairing subspace in the graph is $l/r$. In total there are
$kl/r$ edges in the graph. However, by Corollary 12 the degree in the graph of each standard basis vector is at most $\log_r l$.
Hence there are at most $l \log_r l$ edges in the graph, namely

$$k \cdot \frac{l}{r} \le l \log_r l,$$

and the result follows. ■
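The double-counting step of the proof can be reproduced on a toy incidence structure. The following Python sketch assumes a hypothetical family of coordinate subspaces $S_{(i,a)} = \mathrm{span}(e_b : b_i = a)$, with basis vectors indexed by tuples $b \in \{0,\dots,r-1\}^m$ and $l = r^m$. It only illustrates the counting argument and does not show that these subspaces arise from an actual MDS code.

```python
from itertools import product

def build_graph(r, m):
    """Bipartite incidence between basis vectors (tuples in {0..r-1}^m) and
    the hypothetical coordinate subspaces S_(i,a) = span(e_b : b[i] == a)."""
    basis = list(product(range(r), repeat=m))
    subspaces = {(i, a): [b for b in basis if b[i] == a]
                 for i in range(m) for a in range(r)}
    return basis, subspaces

r, m = 2, 3
l = r ** m
basis, subspaces = build_graph(r, m)
k = len(subspaces)

# Count edges from the subspace side: each subspace has degree l/r.
edges_by_subspace = sum(len(members) for members in subspaces.values())
# Count edges from the basis-vector side: degree = number of subspaces containing it.
degree = {b: sum(1 for members in subspaces.values() if b in members) for b in basis}
edges_by_vector = sum(degree.values())

assert edges_by_subspace == edges_by_vector == k * l // r
assert max(degree.values()) <= m          # m = log_r(l), as in Corollary 12
assert k <= r * m                         # the inequality of Theorem 13
print(k, l, edges_by_subspace, max(degree.values()))
```

In this toy family both counts agree and the inequality $k \le r \log_r l$ holds with equality, which matches the shape of the bound.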
Corollary 14 Let $k$ be the largest number of systematic nodes in an optimal access $(k+r,k,l)$ MDS code, then

$$k = r \log_r l.$$

Proof: The lower bound is derived by the codes constructed in H, 11201 . ■ 
Note that [20] also constructed an optimal bandwidth code with $k = (r+1)\log_r l$. Therefore, in the general case where
we do not require an optimal update code, there is a difference between optimal access and optimal bandwidth codes. Namely,
these two properties are not equivalent (see Table 2).
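The gap between the two quantities is easy to tabulate. The short Python sketch below evaluates $r \log_r l$ and $(r+1)\log_r l$ for a few parameter pairs. The function names and the sample values of $r$ and $l$ are assumptions chosen for illustration, and the sketch restricts $l$ to exact powers of $r$ so that the logarithm is an integer.

```python
def log_r_exact(l, r):
    """Return m with r**m == l, or raise if l is not a power of r."""
    m, power = 0, 1
    while power < l:
        power *= r
        m += 1
    if power != l:
        raise ValueError("l must be a power of r for this sketch")
    return m

def optimal_access_k(l, r):
    """k = r * log_r(l), the value stated in Corollary 14."""
    return r * log_r_exact(l, r)

def long_mds_bandwidth_k(l, r):
    """k = (r + 1) * log_r(l), the optimal bandwidth code length attributed to [20]."""
    return (r + 1) * log_r_exact(l, r)

for r, l in [(2, 8), (2, 1024), (3, 81)]:
    print(r, l, optimal_access_k(l, r), long_mds_bandwidth_k(l, r))
```

The difference between the two values is $\log_r l$, so it grows with the node capacity.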

VI. Discussion and Summary

Assume that an MDS code over the field $\mathbb{F}$ is to be constructed. The capacity $l$ of each node, which is the number of
symbols it can store, equals

$$l = \frac{\mathcal{M}}{\log |\mathbb{F}|},$$

where $\mathcal{M}$ is the size of the node in bits, and $\log |\mathbb{F}|$ is the number of bits it takes to represent each symbol. In this paper we
asked the following question: Given the number of parities $r$ and the capacity $l$, what is the largest number of nodes $k$ such
that there exists an optimal bandwidth (resp. access) $(k+r,k,l)$ MDS code? We used distinct combinatorial tools to derive
three upper bounds on $k$. The first bound considers the general case of optimal bandwidth codes. The last two bounds are tight,
and they consider optimal access codes and optimal update codes with diagonal encoding matrices. Moreover, we showed that in
the general case, the properties of optimal bandwidth and optimal access are not equivalent, although in certain codes, such as
codes with diagonal encoding matrices, they are. The exact bound for optimal bandwidth codes with $r$ parities and capacity $l$
remains an open problem.

Since the capacity of each node is a function of the field size being used, one would like to minimize the field size in order
to increase the capacity and therefore the number of nodes that can be protected. However, in order to satisfy the MDS property
the field size needs to be large enough; e.g., it is well known that for optimal update codes the field $\mathbb{F}_2$ is not sufficient. It is
an interesting open problem to determine the smallest field size sufficient for the MDS property.
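The trade-off between field size and capacity can be sketched numerically. The Python example below is only an evaluation of the formulas, under stated assumptions: the node size of 4096 bits and the field sizes are hypothetical, the field size is restricted to powers of two, and the capacity is rounded down to whole symbols. Whether an MDS code with these parameters exists over a given field is precisely the open question and is not addressed here.

```python
def capacity(node_bits, field_size):
    """l = M / log2|F| for a field of size 2**b, rounded down to whole symbols."""
    if field_size < 2 or field_size & (field_size - 1):
        raise ValueError("this sketch only handles fields of size 2**b")
    bits_per_symbol = field_size.bit_length() - 1   # log2 |F|
    return node_bits // bits_per_symbol

def floor_log(l, r):
    """Largest m with r**m <= l."""
    m, power = 0, r
    while power <= l:
        power *= r
        m += 1
    return m

node_bits, r = 4096, 2            # hypothetical node size and number of parities
rows = []
for field_size in (2**4, 2**8, 2**16):
    l = capacity(node_bits, field_size)
    # r * log_r(l): the optimal access value of Corollary 14 for this capacity
    rows.append((field_size, l, r * floor_log(l, r)))
    print(rows[-1])
```

Each printed row lists the field size, the resulting capacity, and the corresponding value of $r \log_r l$; a smaller field gives a larger capacity and hence a larger value.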
Appendix A
Proof of Theorem 2

Theorem 2 If there exists an optimal bandwidth $(k+r,k,l)$ MDS code then there exists an optimal bandwidth
$(k+r-1,k-1,l)$ MDS code with constant repairing subspaces.

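Proof: Let the encoding matrices for the code in the hypothesis be

$$A = (A_{i,j}) = \begin{pmatrix} A_{1,1} & \cdots & A_{1,k} \\ \vdots & \ddots & \vdots \\ A_{r,1} & \cdots & A_{r,k} \end{pmatrix}, \tag{28}$$

with repairing subspaces $(S_{1,m},\dots,S_{r,m})$ for node $m$. Namely, for any distinct $m, m' \in [k]$ the following holds:

$$S_{1,m} A_{1,m'} = \cdots = S_{r,m} A_{r,m'}, \tag{29}$$

$$\bigoplus_{i \in [r]} S_{i,m} A_{i,m} = \mathbb{F}^l. \tag{30}$$

Define the code

$$C = (C_{i,j}) = \begin{pmatrix} C_{1,1} & \cdots & C_{1,k-1} \\ \vdots & \ddots & \vdots \\ C_{r,1} & \cdots & C_{r,k-1} \end{pmatrix},$$

where

$$C_{i,j} = A_{r,k} A_{i,k}^{-1} A_{i,j} A_{r,j}^{-1}.$$

Note that $C_{r,m}$ is the identity matrix for any $m \in [k-1]$, namely the last row in $C$ is composed of identity matrices. We
claim that this is an optimal bandwidth $(k+r-1,k-1,l)$ MDS code with constant repairing subspaces.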

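Optimal Bandwidth Property: Assume node $m \in [k-1]$ was erased, then use the set of repairing subspaces

$$(S_m,\dots,S_m),$$

where $S_m = S_{r,m}$. Namely, parity node $j$ transmits the projection of its content onto $S_m$. For the optimal bandwidth property we only
need to show that (8) is satisfied. Let $m, m' \in [k-1]$ and $j \in [r]$. Then

$$\begin{aligned}
S_{r,m} C_{j,m'} &= S_{r,m} A_{r,k} A_{j,k}^{-1} A_{j,m'} A_{r,m'}^{-1} \\
&= S_{j,m} A_{j,k} A_{j,k}^{-1} A_{j,m'} A_{r,m'}^{-1} && (31) \\
&= S_{j,m} A_{j,m'} A_{r,m'}^{-1} \\
&= \begin{cases} S_{j,m} A_{j,m} A_{r,m}^{-1}, & m' = m, \\ S_{r,m} A_{r,m'} A_{r,m'}^{-1} = S_m, & \text{else,} \end{cases} && (32)
\end{aligned}$$

where (31) and (32) follow from (29). Therefore, for $m' \neq m$,

$$\operatorname{rank}\begin{pmatrix} S_m C_{1,m'} \\ \vdots \\ S_m C_{r,m'} \end{pmatrix} = \operatorname{rank}\begin{pmatrix} S_m \\ \vdots \\ S_m \end{pmatrix} = \frac{l}{r},$$

and (8) is satisfied. Moreover,

$$\begin{aligned}
\mathbb{F}^l &= \bigoplus_{j \in [r]} S_{j,m} A_{j,m} && (33) \\
&= \bigoplus_{j \in [r]} S_{j,m} A_{j,m} A_{r,m}^{-1} && (34) \\
&= \bigoplus_{j \in [r]} S_m C_{j,m}, && (35)
\end{aligned}$$

where (33) follows from (30), (34) follows since $A_{r,m}$ is an invertible matrix, and (35) follows from (32). Thus (8) is also
satisfied for $m = m'$.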

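MDS Property: This property follows easily from the MDS property of the code in (28). The code $C$ is MDS iff for any $t \in [r]$ and sets
of indices $\{j_1,\dots,j_t\} \subseteq [r]$, $\{m_1,\dots,m_t\} \subseteq [k-1]$ the block submatrix

$$\begin{pmatrix} C_{j_1,m_1} & \cdots & C_{j_1,m_t} \\ \vdots & \ddots & \vdots \\ C_{j_t,m_1} & \cdots & C_{j_t,m_t} \end{pmatrix}$$

is invertible. However,

$$\begin{aligned}
\begin{pmatrix} C_{j_1,m_1} & \cdots & C_{j_1,m_t} \\ \vdots & \ddots & \vdots \\ C_{j_t,m_1} & \cdots & C_{j_t,m_t} \end{pmatrix}
&= \begin{pmatrix} A_{r,k}A_{j_1,k}^{-1}A_{j_1,m_1}A_{r,m_1}^{-1} & \cdots & A_{r,k}A_{j_1,k}^{-1}A_{j_1,m_t}A_{r,m_t}^{-1} \\ \vdots & \ddots & \vdots \\ A_{r,k}A_{j_t,k}^{-1}A_{j_t,m_1}A_{r,m_1}^{-1} & \cdots & A_{r,k}A_{j_t,k}^{-1}A_{j_t,m_t}A_{r,m_t}^{-1} \end{pmatrix} \\
&= \begin{pmatrix} A_{r,k}A_{j_1,k}^{-1} & & \\ & \ddots & \\ & & A_{r,k}A_{j_t,k}^{-1} \end{pmatrix}
\begin{pmatrix} A_{j_1,m_1} & \cdots & A_{j_1,m_t} \\ \vdots & \ddots & \vdots \\ A_{j_t,m_1} & \cdots & A_{j_t,m_t} \end{pmatrix}
\begin{pmatrix} A_{r,m_1}^{-1} & & \\ & \ddots & \\ & & A_{r,m_t}^{-1} \end{pmatrix}. && (36)
\end{aligned}$$

Since each encoding matrix $A_{i,j}$ is invertible, the first and the third matrices in (36) are invertible. The middle matrix is
invertible since the code in (28) is MDS, and the result follows. ■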
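The two algebraic facts used in this proof, that the last block row of $C$ consists of identity matrices and that a block submatrix of $C$ factors as in (36), can be checked with exact arithmetic. The Python sketch below is a sanity check under loudly stated assumptions: it works over the rationals instead of a finite field, uses $l = 2$, $r = 2$, $k = 3$, and takes arbitrary invertible $2 \times 2$ blocks `A(i, j)` of determinant 1 that are not taken from any MDS code. It therefore verifies only the matrix identities, not the MDS or optimal bandwidth properties.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(M):
    """Inverse of an invertible 2x2 matrix over the rationals."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def block(rows):
    """Assemble a matrix from a 2D grid (list of lists) of equally sized blocks."""
    return [sum((blk[i] for blk in row), []) for row in rows for i in range(len(row[0]))]

def block_diag(blocks):
    """Block-diagonal matrix built from a list of 2x2 blocks."""
    zero = [[0, 0], [0, 0]]
    return block([[B if i == j else zero for j in range(len(blocks))]
                  for i, B in enumerate(blocks)])

r, k = 2, 3                      # toy parameters; block size l = 2
I2 = [[1, 0], [0, 1]]

def A(i, j):
    """Arbitrary invertible test block (determinant 1); NOT from an MDS code."""
    return [[1, i], [j, 1 + i * j]]

def C(i, j):
    """C_{i,j} = A_{r,k} A_{i,k}^{-1} A_{i,j} A_{r,j}^{-1}, indices starting at 1."""
    return matmul(matmul(matmul(A(r, k), inv2(A(i, k))), A(i, j)), inv2(A(r, j)))

# The last block row of C consists of identity matrices.
assert all(C(r, m) == I2 for m in range(1, k))

# Factorization (36) for the full submatrix: rows j = 1..r, columns m = 1..k-1.
js, ms = range(1, r + 1), range(1, k)
lhs = block([[C(j, m) for m in ms] for j in js])
left = block_diag([matmul(A(r, k), inv2(A(j, k))) for j in js])
middle = block([[A(j, m) for m in ms] for j in js])
right = block_diag([inv2(A(r, m)) for m in ms])
assert lhs == matmul(matmul(left, middle), right)
```

Because the arithmetic is exact, both assertions hold for any choice of invertible blocks; the particular blocks above were chosen only because they do not commute with each other in general.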